Search CORE

181 research outputs found

Prediction of FAD interacting residues in a protein from its primary sequence using evolutionary information

Author: E Jeong
E Jeong
Gajendra PS Raghava
H Kaur
H Kaur
H Kaur
IB Kuznetsov
M Kumar
M Saito
N Bhardwaj
Nitish K Mishra
RA Bauer
S Ahmad
SF Altschul
T Joachims
V Sobolev
V Vapnik
W Li
Y Korllberg
Y Ofran
Publication venue: BioMed Central
Publication date: 01/01/2010
Field of study

Background: Flavin binding proteins (FBP) plays a critical role in several biological functions such as electron transport system (ETS). These flavoproteins contain very tightly bound, sometimes covalently, flavin adenine dinucleotide (FAD) or flavin mono nucleotide (FMN). The interaction between flavin nucleotide and amino acids of flavoprotein is essential for their functionality. Thus identification of FAD interacting residues in a FBP is an important step for understanding their function and mechanism. Results: In this study, we describe models developed for predicting FAD interacting residues using 15, 17 and 19 window pattern. Support vector machine (SVM) based models have been developed using binary pattern of amino acid sequence of protein and achieved maximum accuracy 69.65% with Mathew's Correlation Coefficient (MCC) 0.39 and Area Under Curve (AUC) 0.773. The performance of these models have been improved significantly from 69.65% to 82.86% with MCC 0.66 and AUC 0.904, when evolutionary information is used as input in SVM. The evolutionary information was generated in form of position specific score matrix (PSSM) profile by using PSI-BLAST at e-value 0.001. All models were developed on 198 non-redundant FAD binding protein chains containing 5172 FAD interacting residues and evaluated using fivefold cross-validation technique. Conclusion: This study suggests that evolutionary information of 17 amino acid patterns perform best for FAD interacting residues prediction. We also developed a web server which predicts FAD interacting residues in a protein which is freely available for academics

Crossref

Springer - Publisher Connector

PubMed Central

Sequence-based identification of interface residues by an integrative profile combining hydrophobic and evolutionary information

Author: A Porollo
AJ Bordner
B Wang
B Wang
BD Alberts
C Cortes
C Sander
ED Levy
F Glaser
F Pazos
H Chen
H Zhou
HM Berman
HS Wong
I Ezkurdia
I Res
J Chung
J Janin
J Kittler
J Kyte
J Mihel
JC Bezdek
Jinyan Li
JR Bradford
JR Bradford
KS Thorn
L Lo Conte
LI Kuncheva
LK Hansen
M Charton
M Guharoy
M Sikic
N H
P Baldi
P Chakrabarti
P Chen
P Cherepanov
P Cherepanov
P Fariselli
Peng Chen
Q Dong
R Singh
RA Laskowski
RD Pascual-Marqui
RM Kini
RP Bahadur
RP Bahadur
S Jones
S Jones
S Jones
SJ de Vries
T Friedrich
T Kohonen
TA Larsen
TJ Bollenbach
Uni-Prot-Consortium
V Chelliah
W Kauzmann
X Du
X Gallet
XW Chen
Y Murakami
Y Ofran
Y Ofran
Y Ofran
Publication venue: BioMed Central
Publication date: 01/01/2010
Field of study

Abstract Background Protein-protein interactions play essential roles in protein function determination and drug design. Numerous methods have been proposed to recognize their interaction sites, however, only a small proportion of protein complexes have been successfully resolved due to the high cost. Therefore, it is important to improve the performance for predicting protein interaction sites based on primary sequence alone. Results We propose a new idea to construct an integrative profile for each residue in a protein by combining its hydrophobic and evolutionary information. A support vector machine (SVM) ensemble is then developed, where SVMs train on different pairs of positive (interface sites) and negative (non-interface sites) subsets. The subsets having roughly the same sizes are grouped in the order of accessible surface area change before and after complexation. A self-organizing map (SOM) technique is applied to group similar input vectors to make more accurate the identification of interface residues. An ensemble of ten-SVMs achieves an MCC improvement by around 8% and F1 improvement by around 9% over that of three-SVMs. As expected, SVM ensembles constantly perform better than individual SVMs. In addition, the model by the integrative profiles outperforms that based on the sequence profile or the hydropathy scale alone. As our method uses a small number of features to encode the input vectors, our model is simpler, faster and more accurate than the existing methods. Conclusions The integrative profile by combining hydrophobic and evolutionary information contributes most to the protein-protein interaction prediction. Results show that evolutionary context of residue with respect to hydrophobicity makes better the identification of protein interface residues. In addition, the ensemble of SVM classifiers improves the prediction performance. Availability Datasets and software are available at <url>http://mail.ustc.edu.cn/~bigeagle/BMCBioinfo2010/index.htm</url>.</p

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

OPUS - University of Technology Sydney

PubMed Central

Joint Evolutionary Trees: A Large-Scale Method To Predict Protein Interfaces Based on Sequence Sampling

Author: A Armon
A Prlic
Alessandra Carbone
BW Matthews
CA Innis
CA Innis
CJ Tsai
CT Porter
DR Caffrey
E Kanamori
ELL Sonnhammer
G Cheng
GH Gonnet
H Chen
I Mihalek
JA Studier
JR Bradford
Ladislas A. Trojan
Michael Levitt
O Lichtarge
O Lichtarge
P Chakrabarti
Richard Lavery
RP Bahadur
S Henikoff
S Jones
S Madabushi
S Miller
SF Altschul
SJ Hubbard
Sophie Sacquin-Mora
SS Negi
Stefan Engelen
T Pupko
W Humphrey
WSJ Valdar
Y Ofran
Y Ofran
ZJ Hu
Publication venue: Public Library of Science
Publication date: 01/01/2009
Field of study

The Joint Evolutionary Trees (JET) method detects protein interfaces, the core residues involved in the folding process, and residues susceptible to site-directed mutagenesis and relevant to molecular recognition. The approach, based on the Evolutionary Trace (ET) method, introduces a novel way to treat evolutionary information. Families of homologous sequences are analyzed through a Gibbs-like sampling of distance trees to reduce effects of erroneous multiple alignment and impacts of weakly homologous sequences on distance tree construction. The sampling method makes sequence analysis more sensitive to functional and structural importance of individual residues by avoiding effects of the overrepresentation of highly homologous sequences and improves computational efficiency. A carefully designed clustering method is parametrized on the target structure to detect and extend patches on protein surfaces into predicted interaction sites. Clustering takes into account residues' physical-chemical properties as well as conservation. Large-scale application of JET requires the system to be adjustable for different datasets and to guarantee predictions even if the signal is low. Flexibility was achieved by a careful treatment of the number of retrieved sequences, the amino acid distance between sequences, and the selective thresholds for cluster identification. An iterative version of JET (iJET) that guarantees finding the most likely interface residues is proposed as the appropriate tool for large-scale predictions. Tests are carried out on the Huang database of 62 heterodimer, homodimer, and transient complexes and on 265 interfaces belonging to signal transduction proteins, enzymes, inhibitors, antibodies, antigens, and others. A specific set of proteins chosen for their special functional and structural properties illustrate JET behavior on a large variety of interactions covering proteins, ligands, DNA, and RNA. JET is compared at a large scale to ET and to Consurf, Rate4Site, siteFiNDER|3D, and SCORECONS on specific structures. A significant improvement in performance and computational efficiency is shown

Crossref

HAL-Inserm

Directory of Open Access Journals

PubMed Central

Scoring docking conformations using predicted protein interfaces

Author: A Brückner
A Porollo
ADJ Van Dijk
AMJJ Bonvin
AW Ghoorah
B Huang
B Li
B Pierce
C Dominguez
C Yan
D Douguet
D Kozakov
D Kozakov
D La
DW Ritchie
DW Ritchie
E Mashiach
G Kuzu
GR Smith
GR Smith
H Chen
H Hwang
H Hwang
H Neuvirth
HM Berman
HX Zhou
HX Zhou
I Halperin
IA Vakser
J Fernández-Recio
J Fernández-Recio
J Janin
J Janin
J Mintseris
J Pande
JD Thompson
Jean-Christophe Nebel
JJ Gray
JL Chung
JR Bradford
K Ethan
K Henrick
K Krawczyk
K Venkatesan
KE Gottschalk
L Li
LC Xue
LC Xue
M Ohh
M Tyagi
M Šikić
MF Lensink
MF Lensink
MG Kann
N Andrusier
N Zhao
OG Othersen
P Baldi
P Chen
P Kuo
PH Patel
PJ Kundrotas
QC Zhang
QC Zhang
QC Zhang
R Chen
R Esmaielbeiki
R Esmaielbeiki
R Grünberg
R Khashan
RA Jordan
Reyhaneh Esmaielbeiki
S Jones
S Liang
S Qin
S Qin
SF Altschul
SJ De Vries
SJ De Vries
SJ Fleishman
SR Comeau
T Fawcett
T Vreven
Y Ofran
Y Ofran
Y Ofran
Y Ofran
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014
Field of study

BACKGROUND: Since proteins function by interacting with other molecules, analysis of protein-protein interactions is essential for comprehending biological processes. Whereas understanding of atomic interactions within a complex is especially useful for drug design, limitations of experimental techniques have restricted their practical use. Despite progress in docking predictions, there is still room for improvement. In this study, we contribute to this topic by proposing T-PioDock, a framework for detection of a native-like docked complex 3D structure. T-PioDock supports the identification of near-native conformations from 3D models that docking software produced by scoring those models using binding interfaces predicted by the interface predictor, Template based Protein Interface Prediction (T-PIP). RESULTS: First, exhaustive evaluation of interface predictors demonstrates that T-PIP, whose predictions are customised to target complexity, is a state-of-the-art method. Second, comparative study between T-PioDock and other state-of-the-art scoring methods establishes T-PioDock as the best performing approach. Moreover, there is good correlation between T-PioDock performance and quality of docking models, which suggests that progress in docking will lead to even better results at recognising near-native conformations. CONCLUSION: Accurate identification of near-native conformations remains a challenging task. Although availability of 3D complexes will benefit from template-based methods such as T-PioDock, we have identified specific limitations which need to be addressed. First, docking software are still not able to produce native like models for every target. Second, current interface predictors do not explicitly consider pairwise residue interactions between proteins and their interacting partners which leaves ambiguity when assessing quality of complex conformations

Crossref

Springer - Publisher Connector

PubMed Central

Kingston University Research Repository

Protein-Protein Interaction Site Predictions with Three-Dimensional Probability Distributions of Interacting Atoms on Protein Surfaces

Author: A Koike
A Porollo
AA Bogan
An-Suei Yang
Attila Gursoy
BJ McConkey
BW Matthews
CC Chang
CD Manning
Ching-Tai Chen
CJC Burges
CM Yu
CT Chen
DE Rumelhart
DT Chang
ED Levy
Ei-Wen Yang
F Glaser
F Rodier
FB Sheinerman
G Moont
H Neuvirth
Hung-Pin Peng
HX Zhou
I Ezkurdia
I Kufareva
I Res
IS Moreira
J Janin
Jeng-Yih Chang
Jhih-Wei Jian
JM Elkins
Jun-Bo Chen
K Henrick
K Levenberg
Keng-Chang Tsai
L Breiman
L Jiang
L Lo Conte
M Reidmiller
M Riedmiller
M Sikic
MH Li
MN Wass
MN Wass
N Tuncbag
O Keskin
P Chakrabarti
PJ Kundrotas
QC Zhang
QC Zhang
RA Laskowski
S Engelen
S Jones
S Sacquin-Mora
Shinn-Ying Ho
SJ de Vries
SJ Hubbard
SS Negi
Wen-Lian Hsu
X Gallet
Y Murakami
Y Murakami
Y Ofran
Y Ofran
Y Ofran
Publication venue: Public Library of Science
Publication date: 06/06/2012
Field of study

Protein-protein interactions are key to many biological processes. Computational methodologies devised to predict protein-protein interaction (PPI) sites on protein surfaces are important tools in providing insights into the biological functions of proteins and in developing therapeutics targeting the protein-protein interaction sites. One of the general features of PPI sites is that the core regions from the two interacting protein surfaces are complementary to each other, similar to the interior of proteins in packing density and in the physicochemical nature of the amino acid composition. In this work, we simulated the physicochemical complementarities by constructing three-dimensional probability density maps of non-covalent interacting atoms on the protein surfaces. The interacting probabilities were derived from the interior of known structures. Machine learning algorithms were applied to learn the characteristic patterns of the probability density maps specific to the PPI sites. The trained predictors for PPI sites were cross-validated with the training cases (consisting of 432 proteins) and were tested on an independent dataset (consisting of 142 proteins). The residue-based Matthews correlation coefficient for the independent test set was 0.423; the accuracy, precision, sensitivity, specificity were 0.753, 0.519, 0.677, and 0.779 respectively. The benchmark results indicate that the optimized machine learning models are among the best predictors in identifying PPI sites on protein surfaces. In particular, the PPI site prediction accuracy increases with increasing size of the PPI site and with increasing hydrophobicity in amino acid composition of the PPI interface; the core interface regions are more likely to be recognized with high prediction confidence. The results indicate that the physicochemical complementarity patterns on protein surfaces are important determinants in PPIs, and a substantial portion of the PPI sites can be predicted correctly with the physicochemical complementarity features based on the non-covalent interaction data derived from protein interiors

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

Application of amino acid occurrence for discriminating different folding types of globular proteins

Author: AG Murzin
H Zhou
HB Shen
HB Shen
HQD Ding
J Cheng
J Shi
KC Chou
KC Chou
M Michael Gromiha
MM Gromiha
MM Gromiha
MM Gromiha
MM Gromiha
MM Gromiha
MM Gromiha
MM Gromiha
P Klein
QS Du
R Development Core Team
T Hirokawa
TS Kumarevel
WS Bu
Y Ofran
Y-h Taguchi
YD Cai
ZZ Wang
Publication venue: BioMed Central
Publication date: 01/10/2007
Field of study

Abstract Background Predicting the three-dimensional structure of a protein from its amino acid sequence is a long-standing goal in computational/molecular biology. The discrimination of different structural classes and folding types are intermediate steps in protein structure prediction. Results In this work, we have proposed a method based on linear discriminant analysis (LDA) for discriminating 30 different folding types of globular proteins using amino acid occurrence. Our method was tested with a non-redundant set of 1612 proteins and it discriminated them with the accuracy of 38%, which is comparable to or better than other methods in the literature. A web server has been developed for discriminating the folding type of a query protein from its amino acid sequence and it is available at http://granular.com/PROLDA/. Conclusion Amino acid occurrence has been successfully used to discriminate different folding types of globular proteins. The discrimination accuracy obtained with amino acid occurrence is better than that obtained with amino acid composition and/or amino acid properties. In addition, the method is very fast to obtain the results.</p

Crossref

Directory of Open Access Journals

PubMed Central

Prediction of protein binding sites in protein structures using hidden Markov support vector machine

Author: A Henschel
A Koike
A Kouranov
A Porollo
A Rossi
AJ Bordner
B Wang
Bin Liu
Buzhou Tang
C Chothia
C Yan
C Yan
C-T Chen
C-W Cheng
H Chen
H Kim
H Neuvirth
H-X Zhou
HX Zhou
I Ezkurdia
I Res
I Tsochantaridis
I Tsochantaridis
J Lafferty
J Song
J Song
J-L Chung
JD Fischer
JL Chung
JR Bradford
JW Torrance
K Henrick
L Holm
L Lo Conte
L Wang
Lei Lin
LR Rabiner
M Gribskov
M Vincent
M Šikić
MH Li
N Li
NJ Burgoyne
P Fariselli
Q Dong
Qiwen Dong
S Ahmad
S Liang
S Qin
SF Altschul
SF Altschul
T Joachims
T Zhang
TH Dang
W Kabsch
WK Kim
X-w Chen
Xiaolong Wang
Xuan Wang
Y Altun
Y Liu
Y Ofran
Y Ofran
Publication venue: BioMed Central
Publication date: 01/01/2009
Field of study

Abstract Background Predicting the binding sites between two interacting proteins provides important clues to the function of a protein. Recent research on protein binding site prediction has been mainly based on widely known machine learning techniques, such as artificial neural networks, support vector machines, conditional random field, etc. However, the prediction performance is still too low to be used in practice. It is necessary to explore new algorithms, theories and features to further improve the performance. Results In this study, we introduce a novel machine learning model hidden Markov support vector machine for protein binding site prediction. The model treats the protein binding site prediction as a sequential labelling task based on the maximum margin criterion. Common features derived from protein sequences and structures, including protein sequence profile and residue accessible surface area, are used to train hidden Markov support vector machine. When tested on six data sets, the method based on hidden Markov support vector machine shows better performance than some state-of-the-art methods, including artificial neural networks, support vector machines and conditional random field. Furthermore, its running time is several orders of magnitude shorter than that of the compared methods. Conclusion The improved prediction performance and computational efficiency of the method based on hidden Markov support vector machine can be attributed to the following three factors. Firstly, the relation between labels of neighbouring residues is useful for protein binding site prediction. Secondly, the kernel trick is very advantageous to this field. Thirdly, the complexity of the training step for hidden Markov support vector machine is linear with the number of training samples by using the cutting-plane algorithm.</p

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

ScholarBank@NUS

Interaction site prediction by structural similarity to neighboring clusters in protein-protein interaction networks

Author: A Koike
A Porollo
A Sacan
A Thomas
A Vazquez
AJ Bordner
B Huang
B Schwikowski
DW Ritchie
GD Bader
H Hishigaki
Hiroyuki Monji
HX Zhou
HX Zhou
I Res
I Xenarios
J Dundas
JR Bradford
K Kinoshita
L Salwinski
M Deng
P Fariselli
P Uetz
RA Laskowski
S Jones
S Letovsky
S Peri
Satoshi Koizumi
T Ito
Takenao Ohkawa
Tomonobu Ozaki
Y Kaneta
Y Ofran
Publication venue: BioMed Central
Publication date: 01/01/2011
Field of study

Abstract Background Recently, revealing the function of proteins with protein-protein interaction (PPI) networks is regarded as one of important issues in bioinformatics. With the development of experimental methods such as the yeast two-hybrid method, the data of protein interaction have been increasing extremely. Many databases dealing with these data comprehensively have been constructed and applied to analyzing PPI networks. However, few research on prediction interaction sites using both PPI networks and the 3D protein structures complementarily has explored. Results We propose a method of predicting interaction sites in proteins with unknown function by using both of PPI networks and protein structures. For a protein with unknown function as a target, several clusters are extracted from the neighboring proteins based on their structural similarity. Then, interaction sites are predicted by extracting similar sites from the group of a protein cluster and the target protein. Moreover, the proposed method can improve the prediction accuracy by introducing repetitive prediction process. Conclusions The proposed method has been applied to small scale dataset, then the effectiveness of the method has been confirmed. The challenge will now be to apply the method to large-scale datasets.</p

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Exploiting residue-level and profile-level interface propensities for usage in binding sites prediction of proteins

Author: A Dubey
A Koike
A Rossi
AH Liu
AJ Bordner
AJ Bordner
AR Panchenko
AT Laurie
B Pils
B Thibert
B Wang
B Wilczynski
C Sander
C Yan
C Yan
C Zhang
CC Chang
D La
DH Morgan
F Osterberg
G Cheng
H Chen
H Deng
H Neuvirth
H Yao
H Yao
HX Zhou
I Res
I Xenarios
IM Nooren
IM Nooren
J Meiler
JL Chung
JR Bradford
JR Bradford
JW Torrance
K Henrick
KA Snyder
L Lo Conte
Lei Lin
MH Li
O Lichtarge
P Chakrabarti
Q Dong
Qiwen Dong
Qw Dong
QW Dong
S Jones
S Karlin
S Liang
SF Altschul
T Down
TJ Magliery
V Chelliah
VN Vapnik
W Kabsch
WS Valdar
WS Valdar
Xiaolong Wang
Y Kim
Y Ofran
Y Ofran
Yi Guan
Z Zhang
Publication venue: BioMed Central
Publication date: 01/05/2007
Field of study

Abstract Background Recognition of binding sites in proteins is a direct computational approach to the characterization of proteins in terms of biological and biochemical function. Residue preferences have been widely used in many studies but the results are often not satisfactory. Although different amino acid compositions among the interaction sites of different complexes have been observed, such differences have not been integrated into the prediction process. Furthermore, the evolution information has not been exploited to achieve a more powerful propensity. Result In this study, the residue interface propensities of four kinds of complexes (homo-permanent complexes, homo-transient complexes, hetero-permanent complexes and hetero-transient complexes) are investigated. These propensities, combined with sequence profiles and accessible surface areas, are inputted to the support vector machine for the prediction of protein binding sites. Such propensities are further improved by taking evolutional information into consideration, which results in a class of novel propensities at the profile level, i.e. the binary profiles interface propensities. Experiment is performed on the 1139 non-redundant protein chains. Although different residue interface propensities among different complexes are observed, the improvement of the classifier with residue interface propensities can be negligible in comparison with that without propensities. The binary profile interface propensities can significantly improve the performance of binding sites prediction by about ten percent in term of both precision and recall. Conclusion Although there are minor differences among the four kinds of complexes, the residue interface propensities cannot provide efficient discrimination for the complicated interfaces of proteins. The binary profile interface propensities can significantly improve the performance of binding sites prediction of protein, which indicates that the propensities at the profile level are more accurate than those at the residue level.</p

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central